ABSTRACT

We find a highly significant correlation of Type Ia supernova magnitudes in the
Union 2.1 compilation of 580 sources. The correlation of magnitude residuals relative
to the $\Lambda$CDM model and color $\times$ redshift has a significance equivalent to 13 standard
deviations, as evaluated by randomly shuffling the data. We generalize the standard
$B - V$ color correction to include a Taylor series in redshift $z$. The goodness of fit $\chi^2$
decreases by more than 50 units using one additional parameter linear in color $\times$ redshift.
The new parameter shifts the supernova best-fit cosmological dark energy density
parameter from $\Omega_\Lambda = 0.71 \pm 0.02$ to $\Omega_\Lambda = 0.74 \pm 0.02$ assuming a flat universe.
Varying $\Omega_m$ and $\Omega_\Lambda$ separately produces $\Omega_m + \Omega_\Lambda = 1$ within errors. The
color-redshift correlation is quite robust, cannot be attributed to outliers, and passes several
tests indicating it does not originate in data selection or systematic error assignments. One
physical interpretation is that supernovae or their environments evolve significantly
with increasing redshift. The previously known rule that bluer supernovae have larger
absolute luminosity tends to flatten out observationally with increasing redshift.

Subject headings: supernovae: general; cosmology: dark matter; cosmology: dark energy;
cosmology: cosmological parameters

1. Introduction 

Observations of Type Ia supernovae provide evidence for an accelerating expansion of the
universe and dark energy. Supernovae are imperfect standard candles, and the corrections for
their intrinsic luminosity have evolved over time. Phillips (1993) observed early on the
important law relating absolute magnitudes and time scales or "stretch factors." In 1998 Tripp
discovered a color correction parameter which greatly improved the model fit for the 29 distant
Type Ia supernovae available at the time (Tripp 1998). The sense of Tripp's correction is that
Type Ia supernovae with bluer colors tend to be intrinsically brighter. Currently several
parameters of magnitude, stretch, color, galactic environment, etc. are fit alongside the
cosmological parameters of dark matter and dark energy density. Global fits couple all the
parameters, so that the parameters correcting absolute magnitudes directly affect the dark
matter and dark energy density.
In the Union 2.1 compilation (Suzuki et al. 2012) the distance modulus $\mu_B$ corrected for color
($c$), stretch ($x_1$) and a certain probability $P_m = P(m_\star^{host} < m_\star^{threshold})$ of the host galaxy is

$$\mu_B(\alpha, \beta, \delta, M_B) = m_B + \alpha x_1 - \beta c + \delta P_m - M_B. \qquad (1)$$

Here $x_1 = s - \bar{s}$, where $s$ is the time stretch factor after redshift ($z$) corrections have been
applied (Goldhaber et al. 2001; Guy et al. 2007). The symbol $c = color - \overline{color}$, where
$color = (B - V)_{max} + 0.057$, and in the Johnson-Cousins system $B$ stands for the blue and $V$ for
the visual magnitude. Overbars denote mean values. We have also subtracted the mean from $P_m$ to
remove a degeneracy affecting the value of $M_B$.

To test whether supernova magnitude relations depend on redshift, we generalize the existing
parameters to a Taylor series expansion in powers of $1 + z$. We then re-fit the data to the
standard $\Lambda$CDM Friedmann-Lemaître-Robertson-Walker (FLRW) framework. The color correction
parameter $\beta$ received our main attention due to its astrophysical interpretation. Our
generalization is

$$\beta c \to \beta(z)\, c = \beta_0 c + \beta_1 (cz - \overline{cz}). \qquad (2)$$
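To make Eqs. (1) and (2) concrete, the following is a minimal sketch of the corrected distance modulus with the redshift-dependent color term. The function name, argument names, and all numerical values in the usage lines are illustrative assumptions and are not taken from the Union 2.1 tables or from our fit results.

```python
def distance_modulus(m_B, x1, c, z, P_m, alpha, beta0, beta1, delta, M_B, cz_mean):
    """Eq. (1) with the color term beta*c replaced by Eq. (2).

    x1, c and P_m are assumed to be mean-subtracted already;
    cz_mean is the sample mean of the product c*z.
    """
    color_term = beta0 * c + beta1 * (c * z - cz_mean)
    return m_B + alpha * x1 - color_term + delta * P_m - M_B


# Hypothetical inputs, chosen only to exercise the formula.
args = dict(m_B=24.0, x1=0.5, c=-0.1, z=0.8, P_m=0.2,
            alpha=0.12, beta0=2.5, delta=-0.05, M_B=-19.3, cz_mean=0.0)
mu_standard = distance_modulus(beta1=0.0, **args)   # reduces to Eq. (1)
mu_extended = distance_modulus(beta1=-1.6, **args)  # includes the c*z term
```

Setting `beta1 = 0` recovers the standard correction, which is the sense in which the extended model is smoothly connected to the conventional one.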

Our study reveals a correlation of color parameters with redshift of very high statistical
significance. The residuals of Union 2.1 data relative to $\Lambda$CDM cosmology are correlated with
$c \times z$ at the level equivalent to a $13\sigma$ Gaussian fluctuation. Introducing a single parameter
$\beta_1$ and fitting the magnitudes of the 580 SNe Ia to a $\Lambda$CDM cosmology reduces the $\chi^2$ value
by more than 50 units compared to previous fits assuming $\beta_1 = 0$. For reference we call this
empirical correlation the "color-redshift effect."



1.1. Definitions 

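Conventional analysis assumes a $\Lambda$CDM cosmology with zero radiation density, dark energy
density $\Omega_\Lambda$ and dark matter density $\Omega_m$. The model's predicted luminosity distance is
(Weinberg 2008, p. 38-55)

$$d_L(z, \Omega_\Lambda, \Omega_m) = \frac{1+z}{H_0 \sqrt{\Omega_k}} \sinh\left( \sqrt{\Omega_k} \int_{1/(1+z)}^{1} \frac{dx}{x^2 H(x)} \right),$$

$$\text{where } H(x) = \sqrt{\Omega_\Lambda x^{-3(1+w)} + \Omega_m x^{-3}}, \qquad (3)$$

with $w$ regulating the dark energy equation of state and $H_0$ the Hubble constant (here
70 km s$^{-1}$ Mpc$^{-1}$). The flat cosmology has $\Omega_k = 1 - \Omega_m - \Omega_\Lambda = 0$. We restrict study here to
the standard value $w = -1$. Our preliminary studies find no great sensitivity between the new
effects and $w$.

The distance modulus $\mu_{model}$ is defined by

$$\mu_{model}(z, \Omega_\Lambda, \Omega_m) = 5 \log_{10}\left( \frac{d_L(z, \Omega_\Lambda, \Omega_m)}{\mathrm{Mpc}} \right) + 25. \qquad (4)$$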
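As an illustration of Eqs. (3) and (4), the sketch below evaluates the luminosity distance and model distance modulus numerically in the flat limit $\Omega_k \to 0$, where the hyperbolic sine reduces to its argument. It is not the code used for the fits reported here: the function names, the use of Simpson's rule, the step count, and the explicit factor of the speed of light (which converts $1/H_0$ into Mpc) are assumptions made for this example.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s (explicit here; implicit in Eq. 3)


def hubble_e(x, omega_m, omega_l, w=-1.0):
    """Dimensionless H(x) of Eq. (3); x = 1/(1+z) is the scale factor."""
    return math.sqrt(omega_l * x ** (-3.0 * (1.0 + w)) + omega_m * x ** -3)


def luminosity_distance(z, omega_m, omega_l, h0=70.0, w=-1.0, steps=2000):
    """Flat-universe d_L in Mpc, integrating with Simpson's rule (steps even)."""
    if z == 0:
        return 0.0
    a, b = 1.0 / (1.0 + z), 1.0
    h = (b - a) / steps

    def f(x):
        return 1.0 / (x * x * hubble_e(x, omega_m, omega_l, w))

    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return (1.0 + z) * (C_KM_S / h0) * total * h / 3.0


def mu_model(z, omega_l, omega_m, h0=70.0):
    """Eq. (4); valid for z > 0."""
    return 5.0 * math.log10(luminosity_distance(z, omega_m, omega_l, h0)) + 25.0
```

A convenient check of such an integrator is the matter-only flat case $\Omega_m = 1$, $\Omega_\Lambda = 0$, for which the integral has the closed form $2(1 - 1/\sqrt{1+z})$.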
1.2. Data and Analysis

We actually began with the 2008 Union compilation of Kowalski et al. (2008).
By interpreting the cuts we were able to obtain the 307 SNe Ia reported as the $3\sigma$ set. Yet the
final data tables of this compilation gave only raw magnitude uncertainties, in which extensive
corrections for systematic errors were not included. We found a large color-redshift correlation,
and explored a fit using the raw magnitude uncertainties. One color $\times$ redshift parameter $\beta_1$
yielded more than 30 units of $\chi^2$ improvement. We then studied the Union 2.0 compilation of
Amanullah et al. (2010), which gave both raw and total uncertainties, but does not use the
$\delta$ parameter of Union 2.1. This paper, like Kowalski et al. (2008), lacked sufficient
detail for all of its results to be reproducible by us. Nevertheless the improvement of $\chi^2$ per
data point using $\beta_1$ and the reported uncertainties was found to be quite comparable. Finally
the Union 2.1 compilation of Suzuki et al. (2012) is one of the largest and most recent collections,
which subsumes many of the earlier SN Ia compilations. The publication gave enough detail to
reproduce its $\chi^2$/df using its best-fit parameters $\alpha$, $\beta$, $\delta$, $\Omega_m$, $M_B$ and $w = -1$. Our
results here are confined to the "$3\sigma$ set" of Union 2.1 data passing certain quality cuts
described below.

In all of our studies the processes of data reversion and analysis calculations were done twice, 
by independently written programs that compared outputs while sharing no common elements 
of code. 



2. Correlation Results 

We first fit redshift-independent parameters $\alpha_0$, $\beta_0$, $\delta_0$, $M_B$, and $\Omega_m$ to represent the
standard $\Lambda$CDM model with equation of state parameter $w = -1$ and $\Omega_\Lambda = 1 - \Omega_m$. Results are
shown in the top line of Table 1.

We will first discuss residuals $\delta_\mu$, which are given by the differences between the distance
moduli and the model:

$$\delta_\mu = \mu_B - \mu_{model}. \qquad (5)$$

The residuals of the fit versus color $\times$ redshift are shown as a scatter plot in Figure 1. The
correlation is readily seen by eye.

Fig. 1. Distance modulus residuals $\delta_\mu = \mu_B - \mu_{model}$ versus color $\times$ redshift from the best-fit
$\Lambda$CDM model of the Union 2.1 data compilation with a flat universe and $w = -1$. The Pearson
correlation of $r = -0.52$ has the significance of a $13\sigma$ effect, as evaluated by a simulation
shuffling the data randomly. Seven points discussed in the text are indicated by circles.

Fig. 2. Histogram of random Pearson correlations $r_{random}$ obtained by shuffling the
(color $\times$ redshift) parameters of the data relative to its residuals. The value $r_{SN}$ computed with
the data is about $13\sigma$ from the mean. The curve shows a Gaussian distribution with mean
$\bar{r}_{random} = -0.00067$ and standard deviation $\sigma_{r-random} = 0.042$.

The Pearson coefficient $r$ quantifies the correlation. It is defined by

$$r(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\, \sqrt{\sum_i (y_i - \bar{y})^2}}. \qquad (6)$$

We find $r_{SN} = r(\delta_\mu, cz) = -0.52$. The significance of $r_{SN}$ was estimated by comparison with a
Monte Carlo simulation using the data itself. The simulation randomly shuffles the
(color $\times$ redshift) data elements, re-calculates a value $r_{random}$, and saves it. Figure 2 shows the
histogram of $r_{random}$ from 10,000 runs. The mean and standard deviation of the random correlations
are $\bar{r}_{random} = -0.00067$ and $\sigma_{r-random} = 0.042$. The data's correlation $r_{SN}$ is about
$13\sigma_{r-random}$ from the mean of the random correlations. The estimated P-value (of order $10^{-39}$)
is too small to simulate or interpret as a fluctuation.
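The shuffle test described above can be sketched in a few lines. This is an illustrative reimplementation rather than either of the two programs used for the analysis; the function names, the fixed random seed, and the default number of runs are assumptions of the example, and any data passed in would have to be supplied by the reader.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson coefficient of Eq. (6)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def shuffle_significance(xs, ys, runs=10000, seed=0):
    """Compare r(xs, ys) with the distribution of r after shuffling xs.

    Returns (r_data, mean of shuffled r, std of shuffled r, offset in sigma).
    """
    rng = random.Random(seed)
    r_data = pearson_r(xs, ys)
    shuffled = list(xs)
    r_random = []
    for _ in range(runs):
        rng.shuffle(shuffled)  # breaks any pairing between xs and ys
        r_random.append(pearson_r(shuffled, ys))
    mean = sum(r_random) / runs
    sd = math.sqrt(sum((r - mean) ** 2 for r in r_random) / (runs - 1))
    return r_data, mean, sd, (r_data - mean) / sd
```

Here `xs` would hold the color $\times$ redshift values and `ys` the residuals $\delta_\mu$. For $n$ points the shuffled coefficients have variance $1/(n-1)$, so for 580 points one expects a standard deviation near $1/\sqrt{579} \approx 0.042$, in line with the value quoted above.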

One might ask whether the correlation is dominated by a few outliers. Actually the Union 2.1
set we are using has already rejected outliers. Starting with 753 SNe Ia in the full compilation,
points further than $3\sigma$ from the model prediction were discarded, leaving the 580 points we use.
The cuts also include (1) that the CMB-centric redshift is greater than 0.015; (2) there must be at
least one point between -15 and 6 rest-frame days from B-band maximum light; (3) at least five valid
data points must exist; (4) the entire 68% confidence interval for $x_1$ must lie between -5 and +5;
(5) data must come from at least two bands with rest-frame central wavelength coverage between
2900 Å and 7000 Å; (6) at least one band must be redder than rest-frame U-band (4000 Å). As an
experiment we explored removing the seven points with the largest $|\delta_\mu|$, indicated with circles in
Figure 1. That reduced the correlation to $r_{SN,cut} = -0.45$, an $11\sigma$ effect.
2.1. A Model with One New Parameter:

We define our fit statistic $\chi^2$ as

$$\chi^2 = \sum_{SNe} \frac{\delta_\mu^2}{\sigma_\mu^2}, \qquad (7)$$

with $\sigma_\mu$ being the complete (statistical and systematic) errors provided in the full table on the
SCP website (see footnote 1). With standard, redshift-independent color correction the best fit
value is $\chi^2 = 550$ for $580 - 5$ degrees of freedom (df) (Table 1). Since nothing in our study
depends on a fraction of a unit of $\chi^2$ we round the values to the nearest integer. Allowing for
redshift-dependent color correction, $\beta c \to \beta_0 c + \beta_1 (cz - \overline{cz})$ finds a best fit with
$\chi^2 = 500$ with $(580 - 6)$ df. That is a 50 unit improvement in the fit from the addition of one
parameter $\beta_1$: see Table 1. Note the $\beta_1 \neq 0$ model is smoothly connected to the formal "null
hypothesis" that no color-redshift correlation exists. Then Wilks' Theorem predicts $\Delta\chi^2$ is
distributed by $\chi^2_1$ in the null, for which a 50 unit fluctuation has a chance probability of
order $10^{-12}$.
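The order of magnitude quoted from Wilks' Theorem can be checked directly, since the upper tail of a $\chi^2$ distribution with one degree of freedom is $\mathrm{erfc}(\sqrt{\Delta\chi^2/2})$. The short sketch below uses only the standard library; the function names are ours, and the second function is included only to relate a number of Gaussian standard deviations to a one-sided tail probability.

```python
import math


def chi2_one_dof_tail(delta_chi2):
    """P(chi^2_1 > delta_chi2), the null probability of a fit improvement."""
    return math.erfc(math.sqrt(delta_chi2 / 2.0))


def gaussian_upper_tail(n_sigma):
    """One-sided probability of a Gaussian fluctuation beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


p_wilks = chi2_one_dof_tail(50.0)      # of order 1e-12
p_13_sigma = gaussian_upper_tail(13.0)  # of order 1e-39
```

The two numbers differ because they answer different questions: the first refers to the improvement of the weighted fit, the second to the unweighted correlation of Section 2.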

Letting $\Omega_m$ and $\Omega_\Lambda$ vary independently we find $\Omega_m + \Omega_\Lambda = 1$ within errors. Compared to
assuming a flat universe a priori the goodness of fit is improved by less than a unit of $\chi^2$ by
letting $\Omega_\Lambda$ be a free parameter. Equivalently, one cannot improve the fit by adding
$\Omega_k = 1 - \Omega_m - \Omega_\Lambda$ as a parameter.

Achieving such a low $\chi^2$/df would be unusual on a statistical basis. However there is a
simple explanation. In developing systematic errors there is a step adjusting the reduced $\chi^2$/df
of each sample to unity. (See Suzuki et al. 2012, following their Eq. 7.) We used the errors as
published and made no sample-by-sample corrections, but doing so after $\beta_1$ is fit would trivially
bring $\chi^2$/df back to one. There is a question of whether the method of assigning systematic
errors, which is iterative and tuned to the model, might have a role in the color-redshift effect.
That is impossible for us to resolve. However Section 3 discusses a simple unbiased procedure that
gives evidence the assignment of systematic errors should not be a decisive feature of our
findings. In passing we note that the constant $M_B$ is not independently observed in simple SN Ia
fits, where its effects are degenerate with the Hubble parameter. Related to this, the traditional
absolute magnitude $M_B$ has been called a "nuisance parameter", and many papers have taken to
omitting the value obtained from their fits. We do not agree that $M_B$ or the other parameters are
devoid of physical meaning, and Table 1 reports all parameters needed to reproduce our results.

Figures 3 and 4 show $\chi^2$ versus $\Omega_m$ and $\Omega_\Lambda$ with and without accounting for the
color $\times$ redshift effect. The plots are made with parameters $\alpha_0$, $\beta_0$, $\delta_0$, $M_B$ evaluated at
their best-fit values point by point and $\Omega_m + \Omega_\Lambda = 1$. The important cosmological parameters
$\Omega_m$ and $\Omega_\Lambda$ are sensitive to the value of $\beta_1$. The significance of the color-redshift effect
for $\Omega_m$ and $\Omega_\Lambda$ depends on how it is computed and assessed. For example, a similar plot with the
other parameters fixed to the global best fit value (Mohlabeng & Ralston 2012) finds that $\Omega_m$ and
$\Omega_\Lambda$ shift by more than their 99.95% ($3\sigma$) confidence level uncertainties. That is appropriate
when other parameters are known. The plots here showing $\chi^2$ with other parameters floating to
their best-fit values are more conservative, and based on the supernova data alone. (For example,
no further information from CMB or galaxy distributions has been assumed. It would be interesting
to explore the effects of joint fits.) The errors in Table 1 come from the $6 \times 6$ or $7 \times 7$
covariance matrices using the convention (see footnote 2) of $\Delta\chi^2 = 1$.

Table 1: Comparison of fits without accounting for the color-redshift effect ($\beta_1 = 0$) and
including it ($\beta_1 \neq 0$). Parameters held fixed are indicated by an asterisk. The color-redshift
effect produces a highly significant improvement of the fit even with the unphysical constraint
$\Omega_m = \Omega_\Lambda = 0$. $\chi^2_{min}$ and $\Delta\chi^2$ have been rounded to the nearest whole number.

1 The full Union 2.1 data files are available at http://supernova.lbl.gov



2.2. Other Redshift-Dependent Parameters:

Random searches will dilute statistics, and we did not make many before finding the
color-redshift effect. After we found its significance was high we considered whether other
parameters might be a cause or contributing factor. The fits involve integrations and are
computationally slow. Thus we found it more orderly (if not exhaustive) to fix parameters and look
for redshift dependence of other parameters sector by sector. The variations and their results are:

$$(\text{fix } \beta_1 = \delta_1 = 0): \quad \alpha \to \alpha_0 + \alpha_1 z; \quad \alpha_1 = -0.036; \quad \Delta\chi^2 = 3.8;$$

$$(\text{fix } \alpha_1 = \beta_1 = 0): \quad \delta \to \delta_0 + \delta_1 z; \quad \delta_1 = -0.093; \quad \Delta\chi^2 = 2. \qquad (8)$$

The a posteriori justification for varying parameters one at a time is that none but $\beta_1$ produces
very significant effects.

We considered whether a $z$-dependent stretch correction $\alpha_1$ might be redundant. The details
of extracting $x_1$ from case-by-case fits are not available to us, which might create a danger of
double-counting. Similarly the systematic errors used in the $\chi^2$ fit were determined by
complicated procedures that involve the FLRW model and its $z$ dependence, producing hidden redshift
dependence in the analysis. Once again we refer to the somewhat crude systematic error study
presented in Section 3.

Fig. 3. Left panel: $\chi^2$ versus $\Omega_m$ without the color $\times$ redshift parameter ($\beta_1 = 0$, upper
points) and including it ($\beta_1 = -1.6$, lower points). Right panel: Same as left with $\Omega_\Lambda$ on the
x-axis. Fits use $\Omega_m + \Omega_\Lambda = 1$ and best-fit values of the remaining parameters point-by-point.

2 Multi-parameter confidence levels can be assessed using several conventions. The $\Delta\chi^2 = 1$
convention follows the text of Suzuki et al. (2012) and its Table 7.



3. Discussion 



The estimated statistical significance of the color $\times$ redshift correlation is sufficiently high
that ordinary confidence levels fail to express it. The correlation is very robust and too large to
be attributed to outliers.

Fig. 4. Left panel: Contours of constant $\chi^2$ in the ($\Omega_m$, $\beta_1$) plane. Contours start at
$\chi^2_{min} + 1 = 500$ and are separated by one unit, except for the outermost contour, corresponding
to $\chi^2 = 550$, that intersects ($\beta_1 = 0$, $\Omega_m = 0.29$), as indicated by the small arrow. Right
panel: Same as left using the ($\Omega_\Lambda$, $\beta_1$) plane with the outermost contour intersecting at
($\beta_1 = 0$, $\Omega_\Lambda = 0.71$). Fits use $\Omega_m + \Omega_\Lambda = 1$ and best-fit values of the remaining
parameters point-by-point.

We mentioned one test in Section 1.2. For an independent test we computed the difference
$\Delta\chi^2(N) = \chi^2(\beta_1 = 0, N) - \chi^2(\beta_1, N)$, where $N$ points were selected on the basis of their
uncertainty-weighted residuals relative to the $\beta_1 = 0$ fit of the full set, as follows. The data
were sorted in order of decreasing weighted residual,
$\delta_1^2/\sigma_1^2 > \delta_2^2/\sigma_2^2 > \ldots > \delta_n^2/\sigma_n^2$. Rejecting the first $N$ points and comparing
$\beta_1 = 0$ with $\beta_1 \neq 0$ produces $\Delta\chi^2(N)$. Thus $\Delta\chi^2(3)$ omits the three largest weighted
residuals, and so on. This procedure is statistically unfair, and strongly biased in favor of the
$\beta_1 = 0$ hypothesis, because at each $N$ it selects the data to maximally confirm the hypothesis.
On the other hand, if outliers were causing the correlation we observe, then we would expect
to see $\Delta\chi^2(N)$ suddenly decrease to zero for some $N$. Instead $\Delta\chi^2(N)$ decreased smoothly
with $N$ from $\Delta\chi^2(0) = 51$ to $\Delta\chi^2(50) = 20$. The computation was stopped at $N = 50$, where
$\chi^2(\beta_1 = 0, N = 50)$/df has been artificially decreased by rejecting points from about
550/(550-6) to about 300/(500-6), a radical reduction.
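The logic of the rejection test can be illustrated with a deliberately simplified toy. In the sketch below all parameters other than the new linear coefficient are held fixed, so the improvement from adding a term linear in $x$ has the closed form $(\sum w x \delta)^2 / \sum w x^2$ with $w = 1/\sigma^2$; the analysis in the text instead refits every parameter at each $N$. Function names and the toy numbers are assumptions for this example.

```python
def delta_chi2_linear(x, resid, sigma):
    """Chi^2 drop from fitting resid = slope * x, versus slope = 0.

    x is assumed to be mean-subtracted already (e.g. cz minus its mean).
    """
    w = [1.0 / s ** 2 for s in sigma]
    sxd = sum(wi * xi * di for wi, xi, di in zip(w, x, resid))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    return sxd * sxd / sxx


def rejection_curve(x, resid, sigma, n_max):
    """Delta chi^2(N) for N = 0..n_max after dropping the N points with
    the largest uncertainty-weighted squared residuals."""
    order = sorted(range(len(x)),
                   key=lambda i: (resid[i] / sigma[i]) ** 2,
                   reverse=True)
    curve = []
    for n in range(n_max + 1):
        keep = order[n:]
        curve.append(delta_chi2_linear([x[i] for i in keep],
                                       [resid[i] for i in keep],
                                       [sigma[i] for i in keep]))
    return curve


# Toy case in which a single outlier creates the whole apparent trend:
toy_curve = rejection_curve([1.0, 2.0, 3.0, 10.0], [0.0, 0.0, 0.0, 5.0],
                            [1.0, 1.0, 1.0, 1.0], n_max=1)
```

In the toy case the improvement vanishes as soon as the outlier is rejected, which is the sudden drop that an outlier-driven correlation would show and that is not seen in the data.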

Two classes of questions naturally arise: 



• Data Selection and Processing: It would be interesting to know whether the effect might
possibly hinge on details of data selection or systematic errors. We have limited information,
and our studies are naturally restricted to the compilation as published. Selection effects
certainly pose a non-trivial question. Amanullah et al. (2010) give an extensive discussion
of the fact that intrinsically brighter supernovae tend to be selected at large $z$. Yet if the usual
color and stretch corrections describe the sources, we cannot see how selection on the
population would produce a false $c \times z$ correlation or a signal in a likelihood ($\chi^2$) test. More
subtle issues would best be pursued by those who control the original data. Regarding data
selection on a quality basis, powerful consistency checks have already been presented in the
references of (Kowalski et al. 2008; Amanullah et al. 2010; Suzuki et al. 2012). For example,
the distributions and correlations of the residuals with $\alpha_0$, $\beta_0$, $\delta_0$, $M_B$, $z$ from both the full
compilations as well as subdivided in the Union studies were shown to be within statistical
expectations.

The Union 2.1 systematic errors were produced with iterations with the model that was
fit. That leads to a question whether our revised model might exploit some anomaly in the
systematic errors. Ruling out this possibility explains the relevance of our correlation $r_{SN}$
that included no errors. This is also the motivation for repeating our fits with raw magnitude
errors, which at least ought to be unbiased. We considered introducing additional terms
into the systematic errors to weaken the color-redshift effect, but soon realized it would be
trivial and irresponsible to make it go away by design. Instead we explored a simple proxy
for systematic errors which padded the systematic magnitude errors with a parameter $\epsilon$ by
the rule $(\Delta m_B)^2 \to (\Delta_\epsilon m_B)^2 = (\Delta m_B)^2 + \epsilon^2$. The procedure tests the possibility that
points of small error might dominate the fit and skew the results. By not attempting to adjust
errors to fit the FLRW model, the padded magnitude uncertainties also tend toward an unbiased
procedure. The recomputed best-fits comparing $\beta_1 \neq 0$ and $\beta_1 = 0$ yielded a smooth and
nearly monotonic variation of $\Delta\chi^2 = 51$ ($\epsilon = 0.001$) to $\Delta\chi^2 = 35.5$ ($\epsilon = 0.2$). The range of
$\epsilon$ spans the differences of $\Delta m_B^{raw} \sim 0.08$ to $\Delta m_B^{systematic} \sim 0.22$. The test does not
support the possibility that systematic error assignments might cause the correlation.

• Physical Interpretation: Assuming the color-redshift effect has an astrophysical origin,
we use the following facts to interpret it. When introducing his color correction parameter
Tripp (1998) wrote that "To accommodate different amounts of reddening observed in
the 29 supernovae of Hamuy et al. (1996), arising either from an intrinsically reddened
supernova or from intervening dust in the parent galaxy, we introduce... another phenomenological
parameter R." The R parameter is essentially $\beta_0$. Continuing, Tripp wrote that "by
applying the same type of color correction to cosmological supernovae even without knowing
whether reddening is intrinsic or due to dust, one will be able to completely standardize
the light output of each explosion and thereby get substantially better values for $q_0$ and
the mass density $\Omega_m$ of the universe". (Italics are ours.) Tripp's presentation cited van
den Bergh (1995), who noticed that model calculations showed a bluer-brighter correlation.
Van den Bergh suggested an effective magnitude parameter which to a first approximation
would be independent of both reddening and the supernova model. Even earlier
Branch and Tammann (1992) had noticed a puzzle that R found in data was much smaller
than expected from models and general considerations involving dust. Besides the work
of (Tripp 1998; Tripp & Branch 1999) the need for color corrections in modern SNe analysis
was noted by (Riess, Press & Kirshner 1996; Guy et al. 2007) and appears in all Union
compilations.

In our generalization of $\beta \to \beta_0 + \beta_1 z$, a positive value of $\beta_1$ would increase the effect with
increasing $z$, as expected from accumulating effects of dust. Instead we find $\beta_1 < 0$, which
we cannot understand as being caused by increasing amounts of dust with distance. The
color-redshift effect is, however, consistent with evolution of the sources. This may be
related to the long-standing puzzle highlighted by Branch and Tammann (1992) and others
that $\beta$ from data fits is smaller than expected. If the color-redshift effect is not taken into
account, its trend will generally decrease the value found in a one constant ($\beta$ or R) model
that is fit over a range of redshifts. Consistent with this, our $z \sim 0$ parameter $\beta_0 = 2.62$ is
somewhat larger than the value found setting $\beta_1 = 0$. The general sense of the correction from
$\beta_1$ is this: modifying the known rule that bluer sources are brighter, the proportionality tends to
decrease with larger $z$. Extrapolating naively to large enough $z$, the bluer-brighter relation
would eventually reverse. Since reversal is implausible, it appears the bluer-brighter relation
probably saturates at large $z$, close to the largest redshifts observed so far. Whether this is a
fact of supernovae or of how the supernovae are observed is unknown.
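The error-padding test described under Data Selection and Processing can be illustrated in the same simplified spirit: each uncertainty is replaced by $\sqrt{\sigma^2 + \epsilon^2}$ and the improvement from a single linear term is recomputed. As a toy, the sketch holds all other parameters fixed and uses the closed-form one-parameter improvement, which is an assumption of the example and not the full refit used for the numbers quoted in the text; the function name is ours.

```python
def padded_delta_chi2(x, resid, sigma, eps):
    """Chi^2 improvement from one linear term after padding the errors.

    Each variance becomes sigma**2 + eps**2; x is assumed mean-subtracted.
    """
    var = [s * s + eps * eps for s in sigma]
    sxd = sum(xi * di / v for xi, di, v in zip(x, resid, var))
    sxx = sum(xi * xi / v for xi, v in zip(x, var))
    return sxd * sxd / sxx


# With equal errors the improvement scales as 1 / (sigma^2 + eps^2):
unpadded = padded_delta_chi2([-1.0, 0.0, 1.0], [2.0, 0.0, -2.0], [1.0] * 3, 0.0)
padded = padded_delta_chi2([-1.0, 0.0, 1.0], [2.0, 0.0, -2.0], [1.0] * 3, 1.0)
```

Padding reduces the weight of points with small quoted errors relative to the rest, so a signal carried mainly by a few such points would fall much faster with $\epsilon$ than this uniform scaling.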

We conclude that the color-redshift effect is evidence for significant evolution of Type Ia
supernovae or modifications of their environments with increasing redshift. Other explanations
may exist. It seems premature to attempt the last word on the highly significant trend we have
found.